33 research outputs found

    High-Throughput SNP Genotyping by SBE/SBH

    Despite much progress over the past decade, current Single Nucleotide Polymorphism (SNP) genotyping technologies still offer an insufficient degree of multiplexing when required to handle user-selected sets of SNPs. In this paper we propose a new genotyping assay architecture combining multiplexed solution-phase single-base extension (SBE) reactions with sequencing by hybridization (SBH) using universal DNA arrays such as all k-mer arrays. In addition to PCR amplification of genomic DNA, SNP genotyping using SBE/SBH assays involves the following steps: (1) synthesizing primers complementing the genomic sequence immediately preceding the SNPs of interest; (2) hybridizing these primers with the genomic DNA; (3) extending each primer by a single base using polymerase enzyme and dideoxynucleotides labeled with 4 different fluorescent dyes; and finally (4) hybridizing the extended primers to a universal DNA array and determining the identity of the bases that extend each primer by hybridization pattern analysis. Our contributions include a study of multiplexing algorithms for SBE/SBH genotyping assays and preliminary experimental results showing the achievable tradeoffs between the number of array probes and primer length on one hand and the number of SNPs that can be assayed simultaneously on the other. Simulation results on datasets both randomly generated and extracted from the NCBI dbSNP database suggest that the SBE/SBH architecture provides a flexible and cost-effective alternative to genotyping assays currently used in the industry, enabling genotyping of up to hundreds of thousands of user-specified SNPs per assay. Comment: 19 pages
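
The decoding step (4) can be illustrated with a toy model. The sketch below is not the paper's multiplexing algorithm: it assumes a simplified rule in which a pool of primers is decodable on an all k-mer array when every primer hybridizes to at least one probe that no other primer in the pool touches, so that the dye colour read at that probe can be attributed to a single SNP. The function names and the exact-complement hybridization model are hypothetical.

```python
# Toy model (assumption): a probe hybridizes to a primer when it is the exact
# reverse complement of some length-k window of that primer.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def probes_hit(primer: str, k: int) -> set[str]:
    """All k-mer probes that would hybridize to the primer in this toy model."""
    return {reverse_complement(primer[i:i + k]) for i in range(len(primer) - k + 1)}


def unique_probes(primers: dict[str, str], k: int) -> dict[str, set[str]]:
    """For each primer, the probes it hits that no other primer in the pool hits."""
    spectra = {name: probes_hit(seq, k) for name, seq in primers.items()}
    result = {}
    for name, probes in spectra.items():
        others = set().union(*(p for n, p in spectra.items() if n != name))
        result[name] = probes - others
    return result


def is_decodable(primers: dict[str, str], k: int) -> bool:
    """True when every primer owns at least one probe (simplified criterion)."""
    return all(unique_probes(primers, k).values())


if __name__ == "__main__":
    pool = {"snp1": "ACGTAC", "snp2": "CGTACG"}
    print(unique_probes(pool, 4))
    print(is_decodable(pool, 4))
```

Under this criterion, multiplexing amounts to partitioning the user-selected SNPs into as few decodable pools as possible; the extension base itself is ignored here because it is read from the dye, not from the probe sequence.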

    Highly Scalable Algorithms for Robust String Barcoding

    String barcoding is a recently introduced technique for genomic-based identification of microorganisms. In this paper we describe the engineering of highly scalable algorithms for robust string barcoding. Our methods enable distinguisher selection based on whole genomic sequences of hundreds of microorganisms of up to bacterial size on a well-equipped workstation, and can be easily parallelized to further extend the applicability range to thousands of bacterial-size genomes. Experimental results on both randomly generated and NCBI genomic data show that whole-genome-based selection results in a number of distinguishers nearly matching the information-theoretic lower bounds for the problem.
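
Distinguisher selection can be sketched as a greedy covering problem. The code below is a minimal illustration, not the scalable implementation the abstract refers to: it assumes that a distinguisher is a fixed-length substring, that it separates two genomes when it occurs in exactly one of them, and that "robust" means every pair must be separated by at least `redundancy` distinguishers. All function and parameter names are hypothetical.

```python
from itertools import combinations


def greedy_barcode(genomes: dict[str, str], length: int, redundancy: int = 1) -> list[str]:
    """Greedily pick substrings until every genome pair is separated `redundancy` times."""
    names = sorted(genomes)
    # Remaining number of separations still required for each pair of genomes.
    need = {pair: redundancy for pair in combinations(names, 2)}
    candidates = sorted(
        {g[i:i + length] for g in genomes.values() for i in range(len(g) - length + 1)}
    )
    chosen: list[str] = []

    def separated_pairs(sub: str) -> list[tuple[str, str]]:
        present = {n for n in names if sub in genomes[n]}
        return [p for p in need if (p[0] in present) != (p[1] in present)]

    while need:
        remaining = [c for c in candidates if c not in chosen]
        best = max(remaining, key=lambda c: len(separated_pairs(c)), default=None)
        if best is None or not separated_pairs(best):
            raise ValueError("genomes cannot be distinguished with these candidates")
        for pair in separated_pairs(best):
            need[pair] -= 1
            if need[pair] == 0:
                del need[pair]
        chosen.append(best)
    return chosen


def barcode(genome: str, distinguishers: list[str]) -> tuple[bool, ...]:
    """Presence/absence vector of the chosen distinguishers in one genome."""
    return tuple(d in genome for d in distinguishers)


if __name__ == "__main__":
    toy = {"a": "AACC", "b": "AAGG", "c": "CCGG", "d": "TTTT"}
    picked = greedy_barcode(toy, length=2)
    print(picked, {name: barcode(seq, picked) for name, seq in toy.items()})
```

With n genomes, at least ceil(log2 n) distinguishers are needed to give every genome a different presence/absence vector, which is the kind of information-theoretic lower bound the abstract compares against.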

    Identification of mammalian orthologs using local synteny

    Background: Accurate determination of orthology is central to comparative genomics. For vertebrates in particular, very large gene families, high rates of gene duplication and loss, multiple mechanisms of gene duplication, and high rates of retrotransposition all combine to make inference of orthology between genes difficult. Many methods have been developed to identify orthologous genes, mostly based upon analysis of the inferred protein sequence of the genes. More recently, methods have been proposed that use genomic context in addition to protein sequence to improve orthology assignment in vertebrates. Such methods have been most successfully implemented in fungal genomes and have long been used in prokaryotic genomes, where gene order is far less variable than in vertebrates. However, to our knowledge, no explicit comparison of synteny-based and sequence-based definitions of orthology has been reported in vertebrates, or, more specifically, in mammals.
    Results: We test a simple method for the measurement and utilization of gene order (local synteny) in the identification of mammalian orthologs by investigating the agreement between coding-sequence-based orthology (Inparanoid) and local-synteny-based orthology. In the 5 mammalian genomes studied, 93% of the sampled inter-species pairs were found to be concordant between the two orthology methods, illustrating that local synteny is a robust substitute for coding sequence in identifying orthologs. However, 7% of pairs were found to be discordant between local synteny and Inparanoid. These cases of discordance result from evolutionary events including retrotransposition and genome rearrangements.
    Conclusions: By analyzing cases of discordance between local synteny and Inparanoid we show that local synteny can distinguish between true orthologs and recent retrogenes, can resolve ambiguous many-to-many orthology relationships into one-to-one ortholog pairs, and might be used to identify cases of non-orthologous gene displacement by retroduplicated paralogs.
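
A local synteny measure of this kind can be sketched as a count of homologous flanking genes. The example below is an assumption-laden illustration and not the authors' scoring scheme: the window size, the homology table, and the function names are all hypothetical. It scores a candidate gene pair by how many neighbours of one gene have a homolog among the neighbours of the other.

```python
def neighbors(order: list[str], gene: str, window: int) -> list[str]:
    """Genes within `window` positions on either side of `gene` along a chromosome."""
    i = order.index(gene)
    return order[max(0, i - window):i] + order[i + 1:i + 1 + window]


def synteny_score(
    order_a: list[str],
    gene_a: str,
    order_b: list[str],
    gene_b: str,
    homologs: set[frozenset[str]],
    window: int = 3,
) -> int:
    """Number of flanking genes of gene_a with a distinct homologous flanking gene of gene_b."""
    flank_b = neighbors(order_b, gene_b, window)
    used: set[str] = set()
    score = 0
    for x in neighbors(order_a, gene_a, window):
        for y in flank_b:
            if y not in used and frozenset((x, y)) in homologs:
                used.add(y)  # each neighbour in B may be matched only once
                score += 1
                break
    return score


if __name__ == "__main__":
    human = [f"h{i}" for i in range(1, 8)]
    # Hypothetical mouse chromosome: conserved block m1..m7, then a retrocopy
    # of m4 ("m4r") inserted among unrelated genes x1..x4.
    mouse = [f"m{i}" for i in range(1, 8)] + ["x1", "x2", "x3", "m4r", "x4"]
    table = {frozenset((f"h{i}", f"m{i}")) for i in range(1, 8)}
    print(synteny_score(human, "h4", mouse, "m4", table))
    print(synteny_score(human, "h4", mouse, "m4r", table))
```

In this toy setting the retrocopy shares coding sequence with the parent gene but none of its genomic context, which is the situation in which a synteny-based score and a sequence-based method would be expected to disagree.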

    Optimum Extensions of Prefix Codes

    An algorithm is given for finding the minimum weight extension of a prefix code. The algorithm runs in O(n³) time, where n is the number of codewords to be added, and works for arbitrary alphabets. For binary alphabets the running time is reduced to O(n²) by using the fact that a certain cost matrix satisfies the quadrangle inequality. The quadrangle inequality is shown not to hold for alphabets of size larger than two. Similar algorithms are presented for finding alphabetic and length-limited code extensions.
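
The abstract does not spell out the algorithm, so the sketch below only illustrates the space in which an extension is sought. It computes the Kraft sum of an existing prefix code and lists the free nodes of its code tree, that is, the shortest strings under which new codewords can be attached without breaking the prefix property. The function names are hypothetical, and no weights or optimisation are modelled.

```python
from fractions import Fraction


def is_prefix_free(codewords: list[str]) -> bool:
    """True when no codeword is a prefix of (or equal to) another codeword."""
    ordered = sorted(codewords)
    # In lexicographic order, a prefix is always immediately followed by a word it prefixes.
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


def kraft_sum(codewords: list[str], alphabet_size: int = 2) -> Fraction:
    """Exact Kraft sum: sum over codewords of alphabet_size ** -len(codeword)."""
    return sum((Fraction(1, alphabet_size ** len(w)) for w in codewords), Fraction(0))


def free_nodes(codewords: list[str], alphabet: str = "01") -> list[str]:
    """Children of internal tree nodes that are neither internal nodes nor codewords."""
    if not codewords or not is_prefix_free(codewords):
        raise ValueError("expected a non-empty prefix-free code")
    code = set(codewords)
    internal = {w[:i] for w in code for i in range(len(w))}  # proper prefixes, incl. ""
    return sorted(
        p + s for p in internal for s in alphabet
        if p + s not in internal and p + s not in code
    )


if __name__ == "__main__":
    code = ["000", "100"]
    print(kraft_sum(code))   # capacity already used by the code
    print(free_nodes(code))  # roots of the subtrees still available
```

The Kraft sums of the code and of its free nodes always add up to 1, so a finite prefix code can be extended by at least one codeword exactly when its Kraft sum is strictly below 1.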

    A note on the MST heuristic for bounded edge-length Steiner Trees with minimum number of Steiner Points

    We give a tight analysis of the MST heuristic recently introduced by G.-H. Lin and G. Xue for approximating the Steiner tree with minimum number of Steiner points and bounded edge-lengths. The approximation factor of the heuristic is shown to be one less than the MST number of the underlying space, defined as the maximum possible degree of a minimum-degree MST spanning points from the space. In particular, on instances drawn from the Euclidean (resp. rectilinear) plane, the MST heuristic is shown to have tight approximation factors of 4 and 3, respectively.
    Keywords: Approximation algorithms, Steiner trees, MST heuristic, fixed wireless network design, VLSI CAD.
    1 Introduction. The classical Steiner tree problem is that of finding a shortest tree spanning a given set of terminal points. The tree may use additional points besides the terminals; these points are commonly referred to as Steiner points. In the Minimum number of Steiner Points Tree (MSPT) problem [7,5] one also seeks a tree ..
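
The abstract does not restate the heuristic's steps. The sketch below assumes the usual description: build a minimum spanning tree on the terminals, then subdivide every tree edge longer than the bound R with ceil(length / R) - 1 degree-2 Steiner points. It works in the Euclidean plane only, and the function names are hypothetical.

```python
import math

Point = tuple[float, float]


def mst_edges(points: list[Point]) -> list[tuple[int, int, float]]:
    """Prim's algorithm on the complete Euclidean graph; returns (u, v, length) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def steiner_points_needed(points: list[Point], max_edge: float) -> int:
    """Steiner points used when each MST edge is cut into pieces of length <= max_edge."""
    if max_edge <= 0:
        raise ValueError("max_edge must be positive")
    return sum(max(0, math.ceil(length / max_edge) - 1) for _, _, length in mst_edges(points))


if __name__ == "__main__":
    terminals = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
    print(steiner_points_needed(terminals, max_edge=1.0))
    print(steiner_points_needed(terminals, max_edge=2.0))
```

The count returned here is what the approximation factors in the abstract bound against the optimum: at most 4 times the optimal number of Steiner points in the Euclidean plane and 3 times in the rectilinear plane (the rectilinear case would need the L1 distance in place of `math.dist`).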

    Guest Editors’ Introduction to the Special Section on Bioinformatics Research and Applications


    Complete mitochondrial genome of the water vole, Microtus richardsoni (Cricetidae, Rodentia)

    Water voles (Microtus richardsoni) are a sensitive species distributed in the mountains of Canada (Alberta, British Columbia) and the United States of America (Idaho, Montana, Oregon, Utah, Washington, and Wyoming). We assembled the complete circular M. richardsoni mitogenome, which is 16,285 bp in length and encodes 13 protein-coding genes, 22 tRNA genes, and two rRNA genes. We estimated the phylogenetic tree of M. richardsoni and 24 related arvicoline species with two outgroup species: Phodopus roborovskii and Cricetus cricetus.